Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified

Abstract

Background: In recent years, model-based approaches such as maximum likelihood have become the methods of choice for constructing phylogenies. A number of authors have shown the importance of using adequate substitution models in order to produce accurate phylogenies. In the past, many empirical models of amino acid substitution have been derived using a variety of different methods and protein datasets. These matrices are normally used as surrogates, rather than deriving the maximum likelihood model from the dataset being examined. With few exceptions, selection between alternative matrices has been carried out in an ad hoc manner.

Results: We start by highlighting the potential dangers of arbitrarily choosing protein models by demonstrating an empirical example where a single alignment can produce two topologically different and strongly supported phylogenies using two different arbitrarily chosen amino acid substitution models. We demonstrate that in simple simulations, statistical methods of model selection are indeed robust and likely to be useful for protein model selection. We have investigated patterns of amino acid substitution among homologous sequences from the three domains of life, and our results show that no single amino acid matrix is optimal for any of the datasets. Perhaps most interestingly, we demonstrate that for two large datasets derived from the proteobacteria and archaea, one of the most favored models in both datasets is a model that was originally derived from retroviral Pol proteins.

Conclusion: This demonstrates that choosing protein models based on their source or method of construction may not be appropriate.
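
The abstract does not say which statistical criteria were used for model selection. As a minimal sketch, assuming the common information-criterion approach (AIC and BIC), candidate amino acid matrices can be ranked from their maximized log-likelihoods on one alignment. The matrix labels, log-likelihood values, and function names below are hypothetical placeholders for illustration, not results from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion, using alignment length as sample size."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def rank_models(fits, n_sites, criterion="aic"):
    """Return (model name, score) pairs sorted from best to worst.

    fits maps a model name to (maximized log-likelihood, extra free parameters).
    Branch lengths are shared by every candidate on a fixed topology, so they
    are left out of the parameter count here (an assumption of this sketch).
    """
    scored = []
    for name, (lnl, k) in fits.items():
        if criterion == "aic":
            score = aic(lnl, k)
        else:
            score = bic(lnl, k, n_sites)
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical log-likelihoods for one alignment; NOT values from the paper.
fits = {
    "WAG": (-12345.6, 0),
    "JTT": (-12400.2, 0),
    "rtREV": (-12310.9, 0),
    "rtREV+G": (-12150.3, 1),  # one extra parameter: gamma shape
}

for name, score in rank_models(fits, n_sites=300, criterion="aic"):
    print(f"{name:10s} {score:.1f}")
```

The point of ranking by a criterion instead of by a matrix's origin is the one the abstract makes: the matrix that fits a dataset best is an empirical question, so every candidate is scored on the data before one is chosen.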

Bibliographic record

  • Author

    McInerney, James

  • Year: 2006
  • Original format: PDF
  • Language of text: English
